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Abstract 

The lattice Boltzmann method can be used to simulate flow through porous media with full 
geometrical resolution. With such a direct numerical simulation, it becomes possible to study 
fundamental effects which are difficult to assess either by developing macroscopic mathematical 
models or experiments. We first evaluate the lattice Boltzmann method with various boundary 
handling of the solid-wall and various collision operators to assess their suitability for large scale 
direct numerical simulation of porous media flow. A periodic pressure drop boundary condition 
is used to mimic the pressure driven flow through the simple sphere pack in a periodic domain. 
The evaluation of the method is done in the Darcy regime and the results are compared to a 
semi-analytic solution. Taking into account computational cost and accuracy, we choose the most 
efficient combination of the solid boundary condition and collision operator. We apply this method 
to perform simulations for a wide range of Reynolds numbers from Stokes flow over seven orders 
of magnitude to turbulent flow. Contours and streamlines of the flow field are presented to show 
the flow behavior in different flow regimes. Moreover, unknown parameters of the Forchheimer, 
the Barree-Conway and friction factor models are evaluated numerically for the considered flow 
regimes. 

Keywords: Fattice Boltzmann method, porous media, Forchheimer force, pore scale simulation, 
turbulence, Darcy, periodic pressure boundary 


1. Introduction 

The simulation of fluid flow and transport processes through porous media is an important 
field that contributes to a number of scientific and engineering applications, e.g., oil extraction, 
groundwater pollution, nuclear waste storage, and chemical reactors. The application of pore-scale 
simulations is challenging in most practical situations since the system under study is often by 
orders of magnitude larger than the characteristic size of the pores. Thus for practical purposes, 
many computational techniques are based on macroscopic models that average over many pores 
and consider average flow rates. 

For steady state flow at low Reynolds numbers a generally accepted and experimentally 
confirmed macroscopic relation between the pressure gradient and flow rate is given by Darcy’s 


Preprint submitted to Elsevier 


August 13, 2015 



law 


( 1 ) 


VP = -pKJ } 'U, 

where p is the fluid dynamic viscosity, K t) is a permeability tensor associated with the geometry 
of the porous medium under consideration, and U and P are the volume averaged velocity 
and pressure, respectively. This model, which was first found by Darcy m in experimental 
investigations, is valid in the regime Re p <s 1, where Re p pd p U/ii is the Reynolds number 
based on a characteristic pore-diameter d p , p denotes the density of the fluid, and U := |U • i 
is the scalar velocity in streamwise direction i. At larger pore Reynolds numbers, Forchheimer 
|2l observed a non-linear deviation from this relation, for which he proposed the addition of a 
quadratic term, namely 


VP = -p/f- 1 U-/Jp|U|U, (2) 

where /3 is the constant inertial factor proposed by Forchheimer, which mainly depends on the flow 
path and is usually found experimentally. Replacing the (j factor by a dimensionless Forchheimer 
constant as f3 = CyK^ 1 , equation ([2]) is known as Hazen-Dupuit-Darcy equation 0. Although 
the Forchheimer equation is widely used in porous media simulations, recent studies suggest that 
the correction ([2]) has also a limited range of applicability (4] [5]| . A recent experimental study of 
Bagci et al. 0 showed that the fi factor which is measured in the laminar regime is not generally 
valid for the whole range of the fluid flow. In Bagci et al. 0 the use of two different values for 
permeability is proposed, together with distinct values f3 for laminar and turbulent flow through 
the spherical particles. In particular, in the range Re p > 300, the flow becomes turbulent and 
significant deviations from the Forchheimer model with the laminar /3 can be observed. 

Based on a series of experiments, Barree and Conway 0 proposed another model that has 
the same structure as the Darcy model {T} but which replaces the absolute permeability Kq by an 
apparent permeability K &vp . The apparent permeability can be measured in the same way as the 
Darcy permeability, but it depends on the flow in a non-linear fashion and is approximated as 


A ap p — m i n 


(Kp - ^min) 

(1 +Re T F ) E ' 


( 3 ) 


Here, E and F are exponential coefficients that describe the heterogeneity of the porous medium, 
and Re T = pl T U/p is the Reynolds number based on a transition length scale l T , and K mi „ is a 
minimum permeability that is attained at high Reynolds numbers. This model is proposed based 
on the conjecture of two plateau areas for permeability at low and high Reynolds number. While 
the plateau areas at low Reynolds numbers corresponds to the Darcy regime, and the prediction of 
the Barree-Conway model in the transition region shows good agreement to experiments mm, 
the plateau at higher Reynolds numbers is largely hypothesized. To the best knowledge of the 
authors, no experimental setup or numerical simulation was published that exhibits the plateau 
behavior at high Reynolds numbers. 

While the rigorous, analytic up-scaling of the pore-scale problem at lower Reynolds numbers 
has received much attention in the literature i0, similar approaches for higher Reynolds 
numbers have not been demonstrated to date. This is mainly due to the immense mathematical 
difficulties that arise in the flow models if a moderate Reynolds number cannot be assumed. 

However, with the rise of extremely capable supercomputers, the direct numerical simulation 
(DNS) has been established as a third possibility for the analysis of homogenized models that can 
complement the classical experimental and the rigorous mathematical averaging approaches. In 
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the past two decades, the class of lattice Boltzmann methods (LBM) has attracted the interest 
of researchers in CFD-related areas. In contrast to traditional CFD approaches based on the 
conservation of macroscopic quantities like mass, momentum, and energy, the LBM models the 
fluid by the kinetics of discrete particles that propagate (streaming step) and collide (relaxation 
step) on a discrete lattice mesh. Due to this kinetic nature, microscopic interactions in the 
fluid flow can be handled even in complex geometries, such as in micro-fluidic devices or 
in porous media maun ns. Moreover, due to the inherently local dynamics, an efficient 
implementation and parallelization of both fundamental algorithmic stages of the LBM is possible 
which allows to harness the computational power of currently available and emerging super¬ 
computing architectures 111 311 141115H . In this study, we use the waLBerla framework (widely 
applicable Lattice-Boltzmann from Erlangen) Q3 that is aimed at massively parallel fluid flow 
simulations, enabling us to compute problems with resolutions of more than one trillion (10 12 ) 
cells and with up to 1.93 trillion cell updates per second using 1.8 million threads oa. waLBerla 
has already been used to study the flow through moderately dense fluid-particle systems in ifTTIl 
and to simulate large scale particulate flows DU- It is based on four main software concepts, 
namely blocks, sweeps, communication and boundary handling lfl9l . The modular structure of 
this framework allows the implementation of new LBM schemes and models. 

However, having immense computational power at hand is not enough to solve relevant 
problems: due to the explicitness of classical LBM methods the spatial and temporal discretization 
characteristics are strongly coupled. Hence, special care has to be taken for pore-scale simulations 
to properly incorporate the physics at the boundaries and inside the domain without over-resolving 
the problem. As already pointed out in the evaluation of Pan et al. j20l . this requires a suitable 
combination of collision and boundary operators. A common way to simulate pressure driven flow 
is to replace the pressure gradient with an equivalent body force and applying stream-wise periodic 
boundary conditions. However, previous studies ETl l22l l23l l24ll show that using this approach 
does not lead to correct flow fields for the flow through complex geometries, e.g., in general 
porous media. In this article we drive the flow by imposing the pressure gradient while applying 
the periodic boundary condition in stream-wise direction and allow the flow to develop based on 
the geometry. In this study, we employ a simple periodic pressure scheme for incompressible flow 
and investigate it in combination with different types of collision operators and a number of first 
to third order bounce back schemes for the treatment of the boundaries. 

To model porous media, we simulate the flow through simple sphere packs by a number of 
different LBM approaches which we briefly outline in Section [2] Their accuracy, convergence 
and the associated computational costs are investigated in Section [3] for flow in the Darcy regime, 
where semi-analytic solutions exist to compare with. Also the scalability of different boundary 
schemes is investigated to choose the best combination for the high resolution simulation in high 
Reynolds numbers flow. Having found a suitable configuration, we then simulate the flow through 
a simple sphere pack in Section^ By sampling over the regime Re p e [ I0 -4 ,5.812 ■ 10 3 ], we 
numerically investigate the plateau hypothesis of the Barree and Conway model in the turbulent 
regime. Additionally, we measure the Forchheimer constant C/ for laminar and turbulent regimes. 
To highlight the ability of the proposed LBM to deal with complex geometries, in Section [5] we 
also present an illustration of turbulent flow over a permeable wall which is constructed by 
rigid-body interaction of the spheres. 
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2. LBM approaches for porous media flow 

The lattice Boltzmann equation (LBE) is a simplification of the Boltzmann kinetic equation 
where we assume that particle velocities are restricted to a discrete set of values e„ i = 0,1,..., 
i.e., the particles can only move along a finite number of directions, connecting the nodes of a 
regular lattice Ii25ll26 l. In the following, we consider a three dimensional 19-direction lattice, the 
so-called D3Q19 model, which provides a good compromise between computational accuracy and 
parallel efficiency. Discretizing in time using a time-step size of At = t n+ \ - t„, the semi-discrete 
LBE then reads as 


At 1 ( f k (x + e k At, t n + 1 ) - f k (x, t n )) = g k (x, t n ), k = 0,..., 18, (4) 

where f k (x, t„ ) represents the probability of finding a particle at some position x and time f„ with 
velocity e k . The left hand side of © corresponds to a discrete representation of the Boltzmann 
streaming operator, while the right hand side g := [g&]| R _ 0 is responsible for controlling the 
relaxation to a local equilibrium. It is generally split into g := Ar~'£l(x, t„) + F(x, f„), where 
fl(x,t„) is a collision term function of f := [/■]**„, and F is a forcing term that drives the 
flow. Provided that the modeled fluid is close to its equilibrium state, Bhatnagar, Gross and 
Krook (BGK, [[27)) proposed that the discrete local equilibria in collision processes can be 
modeled by a second-order expansion of a local Maxwellian. The collision operator takes the 
general form 

Q(x, t n ) = -R(f(x, t n ) - f eq (x, t n )), (5) 

where R is a relaxation operator and f eq (x, t„) is an equilibrium distribution function of f(x, t„) 
which, for incompressible flow, is given by lt28l 

f k q (x, t n ) = w k [p + po [c; 2 e t ■ u + \c~ A (e k ■ u) 2 - jcJ 2 u ■ u]}, (6) 

where w k is a set of weights normalized to unity, p = po + bp while dp is the density fluctuation 
and po is the mean density which we set to po = 1, c s = Ax/( V3Af) is the lattice speed of sound, 
while Ax is the lattice cell width. The macroscopic values of density p and velocity u can be 
calculated from f as zeroth and first order moments with respect to the particle velocity, i.e., 

Z 18 * 118 

k=o f k , u=Po L k=0 ekfk - ™ 

In a lattice Boltzmann scheme, we typically split the computation into a collision and streaming 
step, which are given as 

fk(x, t„) - f k (x, t„) = At g k (x, t n ), (collision) 

fk(x + e t Af, t n+ 1 ) = / A .(x, t„), (streaming) 

respectively, for k — 0,..., 18. The execution order of these two steps is arbitrary and may vary 
from code to code for implementation reasons. For instance, in waLBerla, the order is first 
streaming and then collision. This has the benefit that the stream and collision steps can be fused 
and that the macroscopic values need not be stored and later retrieved from memory. 

Two important components must be investigated whenever one aims to study the numerical 
properties (i.e., stability, convergence and accuracy) as well as the computational cost-efficiency 
of an LBM based approach: Firstly, we have to consider the treatment of the collision term Q(x, t) 
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in the relaxation step. Secondly, since the nodes of the lattice in the LBM approach are only 
distinguished into fluid and solid nodes, the choice of the bounce-back operator for the fluid-wall 
interaction is important for the numerical accuracy. 

Unfortunately, the relaxation and the bounce-back cannot be treated as two independent 
ingredients, since their interplay is essential for finding a sweet spot between computational effort 
and physical accuracy. For instance, in the context of porous media flow, using the BGK collision 
operator in an approach with a single relaxation time (SRT) a >, i.e., Rsrt := oA is reported to suffer 
from numerical instabilities and moreover exhibits unwanted boundary effects G0l|29l l. More 
precisely, the effective modeled pore-size by the numerical scheme becomes viscosity-dependent. 
On a macroscopic level this has the effect that the calculated permeability in the Stokes flow is 
slightly viscosity-dependent while physically it should be independent of the fluid properties 
||20| . The limitations of this simple and computationally appealing approach outlined above are 
the motivation for the investigation of more sophisticated schemes. Similar as in Pan et al. Il20l . 
our aim is to investigate different LBM approaches with the application of porous media flow in 
mind. However, we put our focus on boundary conditions which are suitable for flow driven by 
pressure-gradients instead of lumping these boundary conditions into a body-force. 

Before we present a detailed evaluation of suitable combinations for pore-scale simulations, 
let us briefly summarize the different approaches. 

2.1. Collision operators 

In the evaluation of Pan et al. f20l it is reported that by using multiple relaxation time 123 ED 
(MRT) schemes one can overcome the deficiencies of the SRT approach when applied to porous 
media flow. In the MRT model, the relaxation times for different kinetic modes can be treated 
separately, which allows fine-tuning of the free relaxation-times to improve numerical stability 
and accuracy. This is done by the ansatz 

R MRT :=M r SM, (MRT) 

where the components of the collision matrix in the MRT are developed to reflect the underlying 
physics of collision as a relaxation process. The rows of the orthogonal matrix M are obtained by 
the Gram-Schmidt orthogonalization procedure applied to polynomials of the Cartesian compo¬ 
nents of the particle velocities e*, and S := diag[«o, s \,..., sig] is the diagonal matrix holding the 
relaxation rates s*; cf., e.g., d’Humieres ED for details. Clearly, within this class of relaxation 
schemes, SRT approaches are contained as a special case. The parameters sq , S 3 , S 5 and S 7 are the 
relaxation parameters corresponding to the collision invariant p, and j := pu, respectively, which 
are conserved quantities during a collision. They are set to zero in the absence of a forcing term, 
i.e., if Fk = 0. Pan et al. If20l proposed to choose the remaining modes as follows: (i) viscous 
stress vectors sg, Sn, S 13 , ,V | 4 and si 5 equal to s v , which is related to the kinematic viscosity 
v = p/p as s v = (3v + 0.5) -1 , and (ii), set the kinetic modes sq, sq, S 4 , s&, s&, S 10 , in, si6, S 17 , and 
in to s^ = 8(2 - s r )(8 - Jy) -1 , which is also related to the kinematic viscosity. However, as their 
evaluation showed, it does not lead to viscosity-independent results of the macroscopic quantities, 
except for the multi-reflection boundary scheme. As Khirevich et al. l32l recently stated, this 
problem can be solved if one modifies the aforementioned model in a way that the symmetric 
energy modes, i.e., ij and S 2 , keep the ratio (l/s v - 0.5)/(l/s, - 0.5) constant while s v varies. In 
this study, we set this ratio to 4.6 which provides better accuracy. 

Based on a symmetry argument, Ginzburg |29l proposes a model based on two relaxation 
times (TRT), which can be seen as the minimal configuration that provides just enough free 

5 


relaxation parameters to avoid non-linear dependencies of the truncation errors on the viscosity 
in the context of porous media simulations 1(331134 , 351. The scheme is derived from the MRT 
approach by splitting the probability density functions into the symmetric and anti-symmetric 
components ff := + f- k ) and f k := \(f k - ff), where k is the diametrically opposite direction 

to k [33; 34};35|. Performing a separate relaxation by the two corresponding relaxation rates a> + 
and a> yields the operator 

R trt := w + R + + w R , (TRT) 

where R + and R are the tensorial representations of the operators extracting the symmetric and 
antisymmetric components, respectively. The eigenvalue of the symmetric components is again 
related to the kinematic viscosity as u> + = (3v + 0.5)~\ and the second eigenvalue aT e (0,2) is a 
free parameter. For steady non-linear flow situations, it has been demonstrated that most of the 
macroscopic errors depend on the so-called “magic” parameter 

a = (2— -1(4— i 

\ U) + 2 ]\(jj 2 

which has to be determined for the specific flow setup. The choice A = | is given as a suitable 
value for porous media simulations . Another choice, namely A = ^, yields the exact location 
of bounce-back walls in case of Poiseuille flow in a straight channel lf34l[32l . In our studies, we 
choose different values in the range A e (0, |]. The exact numbers are listed in section [d] 

2.2. Boundary conditions 

In pore-scale simulations, we typically encounter two types of boundary conditions. The first 
one is a solid-wall interaction of fluid particles that come into contact with the porous matrix 
(corresponding to a no-slip condition in continuum-mechanics terminology). The second one is a 
periodic pressure forcing that is applied to drive the flow by a pressure gradient in order to compute 
fluid or matrix properties such as the Forchheimer coefficient or the (apparent) permeability. We 
shall next outline the different schemes under consideration for this study. 

2.2.1. Solid-wall interaction 

In a simple bounce-back (SBB) scheme, the wall location is represented through a zeroth order 
interpolation (staircase approximation), and the collision of particles with a wall is incorporated 
by mimicking the bounce-back phenomenon of a particle reflecting its momentum upon collision 
with a wall, which is supposed to happen half-way between a solid and fluid node. Hence, the 
unknown distribution function is calculated as: 


M*f, dn+1 ) = fkiXfi ,t„). (8) 

where we recall that k is the diametrically opposite direction to k, and we take the values f- after 
collision but before streaming on the right hand side. However, in complex geometries, the wall 
position is not always located half-way on a regular lattice. Hence, especially at coarse resolutions, 
this boundary treatment introduces severe geometric errors, which in turn lead to boundary layer 
effects that can be particularly problematic in porous media of low porosity, where large part of 
the domain is occupied by solid nodes. Here, a discretization that adequately resolve the solid 
boundaries are often computationally prohibitive since the meshes would get too large. Thus, 
several advanced schemes have been proposed to more accurately represent boundaries that are 
not mesh aligned using spatial interpolations ll36l (37l(38ll39ll40ll4ll . 
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Figure 1: Example of the different position of wall with different value of the q. For simplicity, we only display a 
two-dimensional illustration. 


In our numerical study, we compare five different interpolating boundary condition schemes 
to deal with curved boundaries, namely, the linear interpolation bounce-back (LIBB, If36l l. the 
quadratic extension (QIBB, BOl ). the interpolation/extrapolation bounce back (IEBB, l37l ). the 
multi-reflection (MR, l34l ) scheme and the central linear interpolation (CLI, H34I ) scheme. For 
the sake of completeness, let us briefly recall these different approaches. 

Let Xf denote a lattice node which is at a distance of at most i > 0 cells from the boundary 
and let q = |jcy, - x w \/\xf t - x b \ define a normalized wall distance; cf. Fig. |TJ Bounce-back 
schemes based on higher order interpolations require more than one fluid node between nearby 
solid surfaces. For instance, the LIBB condition proposed by Bouzidi et al. If36ll is given by 


_ f(l — 2 q)fk(x f2 , t n ) + 2 qf k (x fl , t n ), q < 1/2, 

MXf " " +1 “ 1(1 - %)Mx fl ,t n ) + ^fk(Xf , t n ), q > 1/2. 

One can also use quadratic interpolation to obtain the QIBB scheme I 36l i40l . where the unknown 
distribution function would be calculated as: 




(‘7(1+ 2 q)fk(x fl , t„) + (1 - 4 q 2 )fk(x fl ,t„)-q( 1 

n+\) ~ \ (2q-l\ f, x -1- f, 




While the use of extrapolation is also possible to obtain better accuracy. However, care must 
be taken to preserve numerical stability 1371 . An interpolation/extrapolation bounce-back (IEBB) 
model was proposed by Mei et al. (371 as an improvement to the scheme of Filippova and Hanel 
m, where a velocity at the wall is computed via 


_ K 2 , <7 <1/2, 

Ubf = {(l -£K, q > 1/2. 

to determine an auxiliary distribution function 

fk(x b ,t „) = Wi.jAp+pofJre* -u bf + 2pr(e A • u ^) 2 - |u (/i -u^]}, 

which is then used to linearly interpolate the unknown as: 

fk(Xf,, t„+ 1) = (1 - X)f k (Xf t , t„) + Xf k (x b , t n ). 
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( 11 ) 


( 12 ) 



























Above we let X = (2 q - l)/(l/w - 2) for q < while for q > | we let X = (2 q - l)/(l/w + 1/2), 
where we set a) = u> + in case of the TRT scheme and to = s v in case of the MRT scheme. 

Another alternative approach to obtain highly accurate bounce-back conditions is the multi¬ 
reflection (MR) scheme lf38l |34| , which reads as 

Mx fl ,t n+ 1 ) = Vi 2 + g 2 ~ 2 )f fk(Xf 2 ,t n ) + ^ fk(Xf 3 ,t n ) 

“ InW > tn) - fkiXfl’tn) + fkiXfi , t„). (13) 

Note that this scheme needs to access five distribution values at three fluid nodes for the update. 
A computationally cheaper variant is given by the central linear interpolation scheme (CLI) that 
only needs three values at two fluid nodes, i.e., 

M*fi . t n+ 1 ) = T^fk(x f2 ,t n ) - fk(x fl , t„) + f k (x h , t n ). (14) 

It should be noted that two latter schemes do not involve a distinction of cases for different values 
of q which allows for an efficient implementation. 

The CLI and MR schemes can be modified with a post collision correction fP c to remove the 
second order error for steady flows. The correction term is constructed based on anti-symmetric 
non-equilibrium of the TRT and MRT collision operators. However, since the correction is not 
suitable for unsteady flow, we do not consider it in this study. Interested readers are referred to the 
work of 01121 HD- 

2.2.2. Periodic pressure boundary condition 

In many applications, fluid flow is driven by a pressure difference. For incompressible flow, 
the corresponding periodicity boundary conditions can be written as 

u(x + Li,t) = u(x, t), p(x + Li, t) - p(x, t) + Ap, (15) 

where L is the length of the domain in the periodic direction i, and A p/L is the pressure gradient 
applied between the inlet and outlet boundaries of the domain. 

In LBM-based approaches, applying this type of boundary condition is not straightforward. 
Simply adjusting the corresponding pressures at the inlet and outlet boundaries produces non¬ 
physical mass defects at the periodic boundary, as was reported e.g. for Poisseulle flow in Dupuis 
1431 . The most commonly employed approaches replace the pressure gradient by incorporating 
an equivalent body force; see e.g. M 03 061 m 08). However, these approaches suffer from the 
inability to predict the pressure gradient accurately for flow situations in general geometries GU 
I22ll23ll24l . For porous media applications, where we face complex pore geometries it is therefore 
desirable to have a boundary condition that is not lumped into a volume forcing and hence does 
not rely on rough predictions of the pressure field. In this work, we employ a pressure boundary 
condition which can be applied for incompressible periodic flows. Here we specify the equilibrium 
distribution function / eq and the non-equilibrium distribution / neq = f - /. eq separately, as we 
shall describe below. As the extension to multiple periodic boundaries is straightforward, we 
consider in our description an essentially one-dimensional setting in which flow propagates from 
the left (L) to the right (R) boundary: 

Since the pressure is related to the density via the equation of state p = pc 2 s , we consider a 
density difference instead of pressure difference in the following. The density at the left boundary 
is obtained by 


Pl=Pr + Ap. 
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( 16 ) 






Figure 2: 3D view of simple sphere array with equal radius. 


Regarding to the relaxation dynamics of the non-equilibrium distribution, in presence of the 
periodic boundaries, it can be approximated as 


hi = fi,R- (17) 

Now by using the above formulations, the unknown distribution functions can be computed as 

fa = jg+J>L, UR), (18) 

09) 

Since the momentum in a periodic channel does not change, the implementation of this approach 
is simple. For instance, by considering & the update (p~8]> and (p~9]> can be performed as: 


fi,L = f,R + W k AP, 
fi,R = fiX - W k Ap. 


( 20 ) 

( 21 ) 


3. Pore-scale simulation at Re « 1 


To verify the implementation and to assess the quality of the different schemes before tackling 
more complex flow problems, we shall in the following first evaluate different combinations of 
the LBM strategies discussed in section[2]in the Darcy regime, and compare the results with a 
semi-analytic solution. We consider flow through a periodic array of spheres arranged on a regular 
lattice as depicted in Fig. [2] To quantify the errors, we compute the dimensionless drag coefficient 


c Fd 

D 671 /jU r 


( 22 ) 


where r is the sphere radius, and Fd is the drag force acting on the sphere. A semi-analytical 
reference solution Co,ref can be derived as a function of the porosity by a series expansion; for 
details on the calculation, we refer to ED. 

To save on computational cost we compute the wall-distances required for the higher order 
boundary conditions only once before the time-stepping loop starts and reuse it in each boundary 
handling step. This can be done since we consider a non-deformable porous medium. Moreover, 
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Figure 3: Logarithmic relative error of the dimensionless drag, \Cd/Cd, ref — 1| as a function of the sphere radius (in lattice 
units) for different boundary schemes and collision operators. 


we exploit the periodicity in the domain and consider only one cubic representative elementary 
volume (REV) containing a single sphere. For details on the boundary handling, we refer to the 
previous section |Z2| 

In the following, we shall summarize our results related to the undesired effect of viscosity- 
dependence of the schemes. However, to make sure that the measured errors are not resulting 
from other approximation issues, and to compare the accuracy in terms of resolution which can be 
obtained by the different schemes, we shall first conduct a grid-convergence study. 

3.1. Evaluation of the grid-dependence 

In this test, we change the size of the sphere and increase the computational domain ac¬ 
cordingly, such that the relative solid volume fraction is fixed to x — 0.6. We compute the 
dimensionless drag force Co and consider the relative error | Co/Coref — 1|. At the resolution 
where this quantity drops below 1% and stays below this limit on finer lattices, we call the result 
grid-independent. Our results are plotted in Fig. [3] 

It can be seen that using higher order boundary conditions generally increases the accuracy 
of the computed drag force. As Fig. [3(a)] indicates, all of the interpolation boundary schemes 
have second-order accuracy when combined with the SRT model. Moreover, for this example, we 
observe that the linear interpolation models, i.e., LIBB and CFI, show higher accuracy than the 
other schemes. For the CFI scheme for instance, we reach a grid-independent solution at r = 8.7 
in lattice unit. 

In Figures [3(b)| and [3(c)] we list the results for the TRT and MRT models, respectively. Here 
we observe a difference between the CLI and the MR schemes, i.e., the CFI scheme converges 
with second-order while the MR scheme converges with third-order. The results also exhibit that 
the IEBB scheme converges with at least second order. For the TRT model, the results for the 
IEBB scheme becomes grid-independent for r = 4.5, the MR with r - 5.1 and the CFI with 
r = 5.7. The MRT scheme also shows better accuracy than the SRT collision model but it is found 
to converge slower than the TRT model, which is additionally considerably less expensive in 
terms of computational effort. It is worth to mention here that the simulation results using the 
TRT and MRT collision can be different by changing the magic number. This is why the SBB 
scheme in the SRT collision gives better results than the TRT and MRT collision in this specific 
test for which the magic number is set to 1 /4 in which the interpolation boundary schemes results 
in better accuracy. 
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Figure 4: Relative error of the drag force C/j/C/),rct - 1 as a function of the sphere displacement (in lattice units) for 
different boundary schemes and collision operators for r = 4.5. 


In these experiments we also observe that the errors for most models do not smoothly decrease 
with respect to the resolution. Comparing for instance the linear LIBB model with the quadratic 
QIBB model reveals that the higher order scheme is less affected by this phenomenon. To 
investigate the source of these fluctuations, we conduct a second series of tests where the center 
of the sphere is shifted streamwise inside the cell. In all cases, the radius of the sphere is kept 
fixed at 4.5 lattice units and the relative volume fraction is again set to x — 0-6. In Fig. [4] 
we show the results for a displacement of 0.0 - 0.5 times a cell width. Although the results 
obtained by the SRT-SBB are relatively good for this setup, we observe that the SBB scheme is 
for all considered collision schemes more sensitive with respect to the positioning of the center. 
Since this behavior stems only from the geometry representation, the use of interpolation models 
significantly increases the reliability of the results. 


3.2. Effect of the viscosity on the computed permeability 

As a number of authors previously reported, the choice of boundary conditions and collision 
models may have a strong impact on the simulation accuracy I20ll29 , T71150 1. To exclude pre- 
asymptotic discretization error modes from our observations and only concentrate on the physical 
inconsistencies introduced by a specific combination of collision and boundary treatment, we 
use a sufficient resolution of 16.5 (in lattice units) for the sphere radius. Since the permeability 
in the Darcy regime is only a geometrical characteristic, it is used for comparison in this study. 
As proposed by Adler ETIl , we compute the average pressure gradient in the flow direction as 
VP • i = —Fd L 3 , where L is again the length of the domain in flow direction. By combining this 
and ( |22| with ([TJ, we find the reference permeability A^ef which we use as a reference. The plots 
of Fig. [5]display the ratio K/ K rcl of the computed permeability by the reference permeability for 
a simple sphere pack with a relative solid volume fraction of x — 0.6. We consider viscosities 
in the range [0.029,0.45] and compare different collision models and boundary conditions. In 
Fig. 5(a)| we show that for all boundary schemes, the SRT collision results in a permeability that 
is strongly depending on the viscosity. While the IEBB scheme is less sensitive than the other 
schemes, the errors we can expect from an SRT collision model are still unacceptable for our 
purposes. The results of Figures 5(b) and 5(c)| depict the same set of experiments conducted 
with the TRT and MRT models, respectively. In these plots, we observe that the SBB, the IEBB, 
as well as the MR, and the CLI schemes are nearly viscosity independent. However, the SBB 
severely under-predicts the permeability due to its staircase representation of the geometry. These 
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Figure 5: Viscosity dependence of the permeability for the solid volume fraction of 0.6. The results represents the 
normalized permeability as K/K re f for different boundary conditions. 


results are in line with those found in the related work of Pan et al. It20l . When comparing only 
the two linear boundary schemes, namely, the CLI and the LIBB, we observe that the LIBB fails 
to produce viscosity-independent results with all collision models under consideration, while the 
CLI has much better properties in this respect. 

3.3. Evaluation of the computational cost 

Our results obtained in section [L2| and |3TT| indicate that the TRT and MRT collision operators 
in combination with the MR boundary scheme results in the most accurate solutions. Given that 
the highly optimized split kernel of the TRT implementation in waLBerla is about as fast as the 
SRT model, there is no justification to use the MRT scheme, which is around a factor of two more 
expensive. 

While the question about the favorable collision operator is rather easy to answer, the choice 
of the right boundary treatment is less obvious. Note, that especially for a DNS of turbulent 
flow, where we have to scale the number of unknowns as Re <>/4 to resolve the smallest dissipative 
scales, we cannot avoid massive parallelism. Unfortunately, the more accurate boundary handling 
methods must access LBM nodes from up to three cells away from the fluid particle boundary, 
see Sect. |2.2| In a massively parallel setting, such physical boundary points may be near a 
logical processor boundary that has been created by the domain partitioning for a distributed 
memory machine. In waLBerla this situation is handled by extra ghost-layer exchanges, i.e. by 
communicating an extended set of distribution functions to neighboring processors along the 
subdomain boundaries. The data dependencies also cause additional synchronization overhead 
in the parallel execution. In a weak scaling scenario with few spheres embedded in the flow 
and where large subdomains can be handled on each processor, we have measured that the more 
advanced boundary schemes may already cause a slowdown of by 10% to 50% compared to the 
SBB method. 

However, for our computational objectives, more complex geometric configurations must 
be considered. Then the communication of data quickly becomes the critical bottleneck. For 
our application, it is particularly important to save on communication since for production runs 
many timesteps are necessary until the desired solution is found. Thus, to keep the compute 
times acceptable, only moderately large blocks of LBM nodes can be assigned to each processor. 
Consequently, the ratio between computation and communication is already a bottleneck despite 
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the highly optimized waLBerla program and even when just one layer of ghost nodes is exchanged 
in every time step. This means that our typical simulation runs are already close to the strong 
scaling limit of the parallel execution, as analyzed in US¬ 
As a good compromise between the cost (including communication on parallel computers) 
and numerical accuracy, we choose here the TRT-CLI scheme for large pore-scale problems. It 
leads to a reasonably good accuracy, has no viscosity-dependence, and needs less communication 
than the MR scheme. Hence, using the slightly less accurate but significantly better parallelizable 
method results in a considerable reduction of run-time. A more concise quantitative analysis of 
the performance trade offs, will require the development of sophisticated parallel performance 
models following the techniques of Godenschwager et al. Ifl6l . but this is beyond the scope of 
the current paper. Godenschwager et al. lfi~6l already presents program optimization for LBM 
simulations in the case of simple boundary conditions, as they are being used in the waLBerla 
software framework. 

4. Pore-scale simulation at Re > 1 

In this section, we present a series of pore-scale simulations for a Reynolds number ranging 
over more than seven orders of magnitude, i.e., Re p e [ 1 O' 4 ,5.812 • 10 3 ]. We conduct a direct 
numerical simulation through the simple sphere pack using the periodic-pressure boundary 
treatment combined with the TRT-CLI scheme. 

To avoid grid-dependencies as well as instabilities resulting from under-resolved turbulence, 
we conduct a grid-independence study for six different Reynolds number ranges. We investigate 
the dependency of the drag coefficient, i.e. Fo/pAU 1 while A represents the cross sectional area, 
on the resolution. To keep the Re number of the flow the same, we adjust the pressure drop, as 
well as the viscosity while changing the resolution. In Table |T[ we list the resulting resolutions 
which were identified as sufficiently resolved for our purposes. Also, the magic A value is shown 
for different flow regimes. We consider a solution as converged if the computed time-averaged 
drag coefficient stops changing. The averaging will be started when the flow is developed, which 
usually takes around 400-600 flow-through times to happen. 


Table 1: The resolution and A value which are chosen for different range of the Re p numbers 


range 

D/Ax 

A 

Re p < 0.01 

23.4 

1/4 

0.01 < Re,, < 198 

37.8 

3/16 

198 < Re p < 509 

59.4 

1/12 

509 < Re p < 1008 

70.2 

io - 5 

1008 < Re p < 1617 

102.6 

io - 5 

1617 <Re p < 5813 

145.8 

6 x 10~ 6 


Fig. [6] depicts the streamlines that are plotted in two perpendicular planes based on an 
instantaneous velocity field at a dimensionless time t* = f x v/D 1 = 8.88 . The plots are presented 
in three views of side, top and stream-wise views, respectively. Fig. |6(a)| presents side view of 
the streamlines of the simulation in Re < 1. We can observe the development of steady boundary 
layers near the solid boundaries and symmetric streamlines before and after the sphere. This 
behavior can be observed from the other views as well. Fig. 6(c) shows that the streamlines that 
are plotted on the XY and XZ planes stick to the plane. 
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As the Reynolds number increases, we can see the onset of inertial effects caused by the 
acceleration and deceleration of the fluid passing through the pores. In Fig. |6(d)| we see that small 
vortices start to form in front of the sphere and some larger ones behind the sphere. In this regime, 
the form drag is added to the viscous drag which increases the head loss. A further increase of 
the flow rate yields that these two types of vortices join to an even larger vortex occupying the 
distance between two spheres, which results in a lower flow capacity; cf. Fig. |6(g)[ The flow 
regime is still laminar and steady. 


- 
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(b) Re p = 0.01,XZ 

(c) Re p = 

0.01, YZ 
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(f) Re p 

= 46, YZ 
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(h) Re p = 79, XZ 

(i) Re p 

= 79, YZ 


Figure 6: Streamlines of the time-averaged flow through the simple sphere pack for linear and nonlinear steady flow at 
X = 0.6 with Re p = 0.01, 46, 79. 


The acceleration/deceleration effects described above are the dominant phenomenona until 
the flow starts to fluctuate. As we can see in Fig. 7(a) at around Re p - 180 the flow is still 
laminar, but the symmetry breaks and the flow becomes unsteady. We can see the flow separation 
behind the sphere and the boundary layer interaction that causes energy dissipation in the unsteady 
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(a) Re p = 183 


(b) Re p = 509 



Figure 7: Velocity field of the flow through the simple sphere pack for non-linear flow at X = 0.6 with Re p = 183, 509, 
762,1008,3880,5812,. 

laminar flow regime. This is in line with the observations of previous studies, where the onset 
of fluctuations is reported between Re p = 110 and Re p — 250 for different porous media 0- 
However, the starting Reynolds number of the fluctuation strongly depends on the porosity, the 
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arrangement of the packing and the size of the spheres. 

By further increasing the Reynolds number of the flow, we observe the onset of turbulent 
chaotic behavior; cf. Fig. 7(b) and |7(c) The flow is in the transition regime from the laminar 
unsteady flow to turbulent flow. We can also observe smaller vortices that can pass through the 
pores and thereby increase the non-linear effects. Increasing the Reynolds number of the flow 
even further, we can see turbulent flow behavior. Fig. |7(e)| and |7(f)| are showing a strongly 
chaotic flow field and random behavior. 

Based on the above mentioned results of the pore scale simulation, we compute the Forch- 
heimer constant Cf, using equation | 2 } and also the normalized apparent permeability by the 
Darcy permeability as defined by Barree-Conway model ®,i.e„ 


K* := 


A a pp 

~Kd' 


(23) 


where AT app is determined using the one-dimensional equivalent of Darcy’s law 0 by means of 
a spatially and temporally averaged velocity. The temporal averaging starts after the drag force 
acting on the sphere is converged. Once converged, we average the volume averaged velocity for 
a period of 100-150 flow through times. 

Our results are presented in Fig. [ 8 ]for the whole range of Reynolds numbers considered in this 
study. In the left figure, we compare our results to a best fit with respect to the Barree-Conway 
model, obtained for Reynolds numbers up to Re p = 1 000. We see that the results can be fitted 
well to the model equations which agrees with the experimental validations reported in (4j ( 6 ] EJ. 
However, for higher Reynolds numbers, we observe a significant deviation from the plateau region 
that is predicted by the model. It indicates that, while the model of Barree-Conway can be adjusted 
well to flow simulations in the range up to Re p = 1 000, it lacks enough degrees of freedom to 
model flow beyond that range. Including the additional data for high Reynolds numbers increases 
the fitting error in the low Reynolds regime, where the model has been validated by previous 
studies of various authors. Based on our observations, we moreover conclude that K„ u „ in the 
model does not have the physical meaning of a minimal permeability that is attained in the high 
Reynolds limit. It should rather be referred to as a free model parameter that can be chosen to fit 
the curves for a porous medium at hand in the regime Re p < 1 000. 

The right plot of Fig. [ 8 ] depicts that the value of C h strongly depends on the Reynolds number, 
which is in line with the experimental results of Bagct et al. 0 who stated that the /3 factors have 
to be chosen differently for different flow regimes. However, it can be seen that two approximately 
constant values can be considered for laminar and turbulent flow. In our setup these are 0.0078 for 
8 < Re < 79 and 0.023 for 762 < Re < 5812, respectively. The former is the non-linear laminar, 
the latter is for the turbulent regime. 

A non-dimensional form of equation ([2]) can be provided in form of a friction factor, i.e.. 


Fk = 77 — + F, (24) 

Re K 

where Fk = (A p/L) \[Kn/f>IJ 2 , and Re k = pU \J Knl p. Fig. [^presents the results of the friction 
factor for different Reynolds numbers. As it can be seen, for Re K < 1 the results are inversely 
proportional to Rex which is the typical behavior of the friction factor. The deviation starts at 
Re K = 5 (Re p = 20) which is the inertial regime of the flow. The transition regime from unsteady 
flow to chaotic flow is 70 < Rex <110 (180 < Re p < 300). By further increasing the Reynolds 
number, we can see the turbulent flow regime in which the friction factor converges to nearly a 
constant value which depends on the porous geometry, and in our simulation is 0.026. 
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Figure 8: Normalized permeability (K*, left) and Forchheimer constant (Cf , right) for pressure-driven flow with different 
Reynolds numbers through the simple sphere pack. 



Figure 9: Permeability-based friction factor versus permeability-based Reynolds numbers and particle diameter-based 
Reynolds numbers. 
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5. Turbulent flow over a porous bed 


By using the results of the evaluation which is presented in the previous sections, we provide a 
pore-scale simulation for semi-realistic porous structure consisting several spherical particles. To 
construct a semi-real porous structure, our in-house physics engine framework I2JED is used to 
simulate the rigid body interaction between spherical bodies. In this simulation, several spherical 
particles with different size are configured such that they fall down due to the gravity and onto 
the bed. After the particles come to rest, the geometry is imported as a permeable bed in the 
waLBerla framework for fluid flow simulation. 

To resolve all scales, the porous media part of the domain is refined using three different levels 
of refinement. Periodic boundary conditions have been used in the stream-wise and span-wise 
direction while the top boundary is free surface. We apply the TRT model of collision and the CLI 
scheme for wall boundary condition. The simulation is executed on the LIMA supercomputer, 
with 64 nodes and 24 cores on each node. With this configuration it roughly takes 24 hours of 
computation for the whole simulation for a spatial resolution of 9 x 10 7 cells. Fig. 
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shows 

the contours of the flow velocity of the simulation after 5 x 10 6 timesteps. The block structure 
used in waLBerla is shown on the left hand side of the figure to indicate the static refinement 
of the meshes. The grids are also displayed on the right hand side of the contour to show the 
grid resolution. We point out that the results are stored after coarsening the grids throughout the 
domain by a factor of 8 to keep the size of the resulting outputs reasonable for post-processing 
and visualization. 


6. Conclusion 

In this article, the Lattice Boltzmann method is used to realize methods to simulate the 
flow through periodic and random sphere packs for a large range of Reynolds numbers. The 
article studies different collision operators and different types of boundary conditions. A new 
periodic pressure boundary condition has been developed to drive the flow. While the SRT 
collision operator produces a viscosity dependent permeability that affects the accuracy of porous 
flow simulations, the MRT model reduces the viscosity dependency of the results significantly. 
However, the TRT collision model in combination with the MR or CLI boundary model can also 
lead to accurate viscosity-independent results and is computationally cheaper as the MRT scheme. 
The grid independence for low Reynolds numbers is verified by computing the drag force and 
permeability as compared to the analytic solution for different sizes of the spheres. The results 
indicate that using the CLI combined with the TRT leads to good accuracy in both evaluation 
criteria. Since the TRT in its optimized version results in a run-time similar the SRT collision 
model, and since using the MRT for low Reynolds number flow does not improve the accuracy 
significantly, the TRT collision model is used in the remainder of the article. Considering the 
accuracy versus computational cost in a massively parallel setting indicates that the CLI boundary 
scheme is a favorable compromise choice. 

The results for the permeability shows that, contrary to the Barree-Conway model, no plateau 
area for the permeability can be observed at high Reynolds number. Although these results show a 
change from concave downward to convex upward in the transition between the laminar unsteady 
flow and the turbulent flow, it decreases by increasing the Reynolds number which is in line with 
the theory derived from the Navier-Stokes equation. Considering the Forchheimer model for 
high Reynolds number flow, it is shown that two different values can be considered for laminar 
and turbulent regimes of the flow. Also the results show good agreement with the friction factor 
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Figure 10: Turbulent flow over a porous bed, left to right, the block structure, velocity contour and the grids, respectively. 
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models. Different flow regimes can be found for different Reynolds numbers when the square root 
of the permeability is considered as the characteristic length scale. The results demonstrate that 
the flow regimes can be defined as, creeping flow, laminar steady, laminar unsteady, chaotic flow 
and turbulent flow. 
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